Authors: Ziyue Xiao (ziyuex), Minxue Gu (minxue)
Based on the locations of fast food restaurants, we . And we made a buffer within the range of 1500 miles to indicate whether
In this graph, we examine differences in obesity rates among ethnic and income groups. The overall obesity rate decreases as income increases, which suggests qualitatively that income is related to obesity. The relatively low price of fast food may be a driving factor, since it can affect the food choices of people at different income levels. At the same time, obesity rates in Los Angeles differ to some extent among ethnic groups at the same income level, which may reflect different tendencies toward fast food. These differences shrink as overall income rises, indicating that income may play a more important role than ethnicity in the obesity rate.
In this graph, we examine differences in obesity rates between ethnic groups and between people who live near and far from fast food restaurants. The obesity rate of people who live closer to fast food restaurants is generally higher than that of people who live far from them. This may mean, qualitatively, that the convenience of fast food restaurants has an impact on obesity: people may choose fast food because it is a convenient and fast way to get food. It is worth noting that Black and White residents were most affected by proximity to fast food restaurants, while Asian residents were less affected.
In this graph, we examine differences in obesity rates between income groups and between people who live near and far from fast food restaurants. In addition to the relationship between income and obesity rate described above, we can see that in the high-income group the obesity rate of people near fast food restaurants is higher than that of people far from them. In the low-income group the difference is less obvious, or even reversed, which may indicate that the obesity rate of low-income people is related to other factors.
##
## Call:
## lm(formula = obesityRate ~ hasAccessToFastFood, data = obesity_access)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.1228 -4.2228 -0.4228 4.1772 22.8772
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 26.2228 0.1313 199.68 <2e-16 ***
## hasAccessToFastFood 2.6763 0.2337 11.45 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.236 on 2322 degrees of freedom
## Multiple R-squared: 0.05347, Adjusted R-squared: 0.05306
## F-statistic: 131.2 on 1 and 2322 DF, p-value: < 2.2e-16
This analysis compares the obesity rates of people who do not have access to fast food restaurants (access = 0) with those who do (access = 1). From the summary, the regression coefficient is 2.6763, which is the slope of the line and, because the predictor is binary, also the difference in average obesity rate between the two groups: about 26.22 without access versus about 28.90 with access. This is consistent with the effect of proximity to fast food restaurants on obesity rates seen earlier in the equity analysis, although access alone explains only about 5% of the variation in obesity rates (R-squared = 0.05347).
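To illustrate why the coefficient on a 0/1 predictor equals the difference between the two group means, here is a minimal sketch in R. The data frame `toy` and all of its values are simulated for illustration and are not the report's data; only the column names mirror the model above.

```r
# Simulated data (hypothetical): 200 tracts without access, 100 with access.
set.seed(1)
toy <- data.frame(
  hasAccessToFastFood = rep(c(0, 1), times = c(200, 100))
)
toy$obesityRate <- 26 + 2.7 * toy$hasAccessToFastFood +
  rnorm(nrow(toy), sd = 5)

fit <- lm(obesityRate ~ hasAccessToFastFood, data = toy)

# Group means computed directly from the data.
group_means <- tapply(toy$obesityRate, toy$hasAccessToFastFood, mean)

# The intercept is the mean of the access = 0 group, and the slope is the
# difference between the two group means.
intercept <- unname(coef(fit)["(Intercept)"])
slope <- unname(coef(fit)["hasAccessToFastFood"])
mean_diff <- unname(group_means["1"] - group_means["0"])

print(c(slope = slope, mean_diff = mean_diff))
```

The two printed numbers should agree up to floating-point error, which is the sense in which the slope is "the difference between the two groups".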
##
## Call:
## lm(formula = obesityRate ~ householdIncome, data = obesity_access_race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.871 -2.924 0.044 3.026 18.381
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.177e+01 8.639e-02 367.72 <2e-16 ***
## householdIncome -6.580e-05 1.030e-06 -63.86 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.428 on 11458 degrees of freedom
## (6 observations deleted due to missingness)
## Multiple R-squared: 0.2625, Adjusted R-squared: 0.2624
## F-statistic: 4078 on 1 and 11458 DF, p-value: < 2.2e-16
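This graph plots the relationship between income and obesity rate; we use ‘loess’ instead of ‘lm’ to draw a smooth curve. From the summary, each additional dollar of household income is associated with a decrease of 6.580e-05 percentage points in the obesity rate, or about 0.66 percentage points per 10,000 dollars. The p-value is very small (< 2e-16), so the result is statistically significant, and income alone explains about 26% of the variation in obesity rates (R-squared = 0.2625).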
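Because a per-dollar slope is hard to read, it can help to rescale it to a larger income step. The sketch below does that arithmetic with the estimates from the summary above; it assumes `obesityRate` is measured in percent, so the slope is in percentage points, and the two income values are arbitrary examples.

```r
# Estimates copied from the summary above.
intercept <- 3.177e+01
slope_per_dollar <- -6.580e-05

# Change in obesity rate (percentage points) per 10,000 dollars of income.
slope_per_10k <- slope_per_dollar * 10000

# Fitted obesity rates at two example income levels (arbitrary choices).
incomes <- c(50000, 100000)
fitted_rates <- intercept + slope_per_dollar * incomes

print(slope_per_10k)
print(fitted_rates)
```

An equivalent approach would be to refit the model with income expressed in units of 10,000 dollars, which changes the scale of the coefficient but not the fit.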
##
## Call:
## lm(formula = obesityRate ~ Race, data = obesity_access_race)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.610 -3.885 -0.323 3.791 23.363
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25.1852 0.1272 198.016 < 2e-16 ***
## RaceBlack 3.2378 0.2018 16.047 < 2e-16 ***
## RaceHispanic 2.2243 0.1673 13.294 < 2e-16 ***
## RaceNative American 3.4700 0.6271 5.533 3.22e-08 ***
## RaceOther 3.3037 0.1760 18.775 < 2e-16 ***
## RacePacific Islander 2.9421 1.5209 1.934 0.0531 .
## RaceTwo Or More 0.8496 0.2179 3.899 9.72e-05 ***
## RaceWhite 1.8521 0.1651 11.221 < 2e-16 ***
## RaceWhite Non-Hispanic 0.5516 0.1727 3.195 0.0014 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.027 on 11451 degrees of freedom
## (6 observations deleted due to missingness)
## Multiple R-squared: 0.04998, Adjusted R-squared: 0.04931
## F-statistic: 75.3 on 8 and 11451 DF, p-value: < 2.2e-16
Asian, the first category alphabetically, was taken as the “baseline” against which the other categories were compared. The regression coefficient for each race category can be interpreted as the difference in obesity rate relative to Asians. All of the coefficients are positive, so every other category has a higher estimated obesity rate than Asians. The difference is smallest for White Non-Hispanic (0.55) and Two Or More (0.85) and largest for Native American (3.47), Other (3.30) and Black (3.24). The Pacific Islander coefficient is not significant at the 0.05 level (p = 0.0531). While most of these associations were statistically significant, overall, racial differences explained only about 5 percent of the variation in obesity rates.

Multiple regression: analyze the association between obesity and neighborhood income, ethnic composition and working hours in fast-food supply areas.
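The choice of baseline comes from how R orders factor levels. The following sketch, on simulated data with only three made-up categories, shows the default alphabetical baseline and how `relevel()` could be used to compare against a different group. The data frame `toy` and the column `Race_ref_white` are hypothetical names introduced for this example.

```r
# Simulated data (hypothetical): three categories with different mean rates.
set.seed(2)
toy <- data.frame(
  Race = rep(c("Asian", "Black", "White"), each = 50),
  stringsAsFactors = FALSE
)
means <- c(Asian = 25, Black = 28, White = 27)
toy$obesityRate <- unname(means[toy$Race]) + rnorm(nrow(toy), sd = 2)

# factor() sorts levels alphabetically, so "Asian" becomes the reference
# level and gets no coefficient of its own.
toy$Race <- factor(toy$Race)
fit_default <- lm(obesityRate ~ Race, data = toy)

# relevel() picks a different reference level. The fitted values are
# unchanged; only the comparisons expressed by the coefficients differ.
toy$Race_ref_white <- relevel(toy$Race, ref = "White")
fit_white <- lm(obesityRate ~ Race_ref_white, data = toy)

print(names(coef(fit_default)))
print(names(coef(fit_white)))
```

Changing the reference level does not change the model's fit or R-squared, so the choice can be made purely for ease of interpretation.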
##
## Call:
## lm(formula = obesityRate ~ householdIncome + race + WKHP, data = obesity_access_race_True)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.2012 -3.2199 -0.1664 3.4465 19.2382
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.246e+01 7.353e-03 4414.421 <2e-16 ***
## householdIncome -7.475e-05 6.081e-08 -1229.244 <2e-16 ***
## raceblack 1.571e+00 8.201e-03 191.496 <2e-16 ***
## racehispanic 7.176e-01 7.179e-03 99.965 <2e-16 ***
## raceother 6.890e-01 6.349e-03 108.519 <2e-16 ***
## racewhite 8.068e-01 7.073e-03 114.061 <2e-16 ***
## WKHP 3.063e-05 9.206e-05 0.333 0.739
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.522 on 5486548 degrees of freedom
## Multiple R-squared: 0.2401, Adjusted R-squared: 0.2401
## F-statistic: 2.889e+05 on 6 and 5486548 DF, p-value: < 2.2e-16
The model examines the relationship between obesity rates and income, race and hours worked in areas with nearby fast food restaurants. We expected that longer working hours might compress leisure time and push people toward fast food where fast food restaurants are present. However, the coefficient on hours worked (WKHP) is very small (3.063e-05) and not statistically significant (p = 0.739), so with income and race held constant this model gives no evidence of an association between working hours and the obesity rate. At the same time, each additional dollar of income is associated with a decrease of 7.475e-05 percentage points in the obesity rate (about 0.75 percentage points per 10,000 dollars), which suggests that after controlling for the other factors the association between income and obesity rate is slightly stronger than in the simple regression (6.580e-05). The relationship between race and obesity rates is similar to what was seen before: all race coefficients are positive relative to the omitted baseline category, and the Black coefficient is the largest (1.571).
Next we build a predictive model in which obesity risk is the predicted variable, with neighborhood income and ethnic composition as predictors, and use it to predict the presence of obesity risk.
##
## Call:
## glm(formula = obesity_risk_population ~ race + householdIncome,
## family = quasibinomial(), data = predict)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8196 -0.8163 -0.4309 0.9548 2.1414
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.556e+00 6.477e-02 24.02 <2e-16 ***
## race -1.060e-02 7.627e-03 -1.39 0.165
## householdIncome -4.069e-05 9.949e-07 -40.90 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasibinomial family taken to be 0.8523689)
##
## Null deviance: 13326 on 11465 degrees of freedom
## Residual deviance: 11001 on 11463 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 5
Here we create a variable (obesity risk) that is 1 if the household is close to a fast food restaurant and has a household income of less than 90,000, and 0 otherwise.
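A minimal sketch of how such an indicator could be built in R is shown below. The rows in `toy` are invented for illustration; the column names follow the tibble printed in this report, and the exact code used for the report may differ.

```r
# Hypothetical rows; column names follow the tibble printed in this report.
toy <- data.frame(
  householdIncome = c(170871, 45000, 82000, 60000),
  hasAccessToFastFood = c(0, 1, 1, 0)
)

# 1 when the household is near a fast food restaurant AND its income is
# below 90,000; 0 in every other case.
toy$obesity_risk_population <- as.integer(
  toy$hasAccessToFastFood == 1 & toy$householdIncome < 90000
)

print(toy)
```

Only the second and third rows meet both conditions, so only they are flagged as at risk.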
## # A tibble: 1 × 10
## GEOID Race householdIncome householdIncome… obesityRate hasAccessToFast…
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 06037295103 Two Or More 170871 50319 22.9 0
## # … with 4 more variables: TotalPopulation <dbl>, income <chr>, race <dbl>,
## # obesity_risk_population <dbl>
## 1
## 0.004189577
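The predicted value above is a probability obtained by passing the linear predictor through the inverse logit. The sketch below reproduces that calculation by hand from the rounded coefficients in the summary. Two inputs are assumptions: the numeric `race` code is hidden in the truncated tibble, so the value 7 is a guess based on alphabetical ordering of the categories, and the income used is the first `householdIncome` column (170871).

```r
# Coefficients copied (rounded) from the glm summary above.
b0 <- 1.556
b_race <- -1.060e-02
b_income <- -4.069e-05

# Assumptions: numeric race code 7 and householdIncome 170871 for this record.
race_code <- 7
income <- 170871

# Linear predictor on the log-odds scale.
eta <- b0 + b_race * race_code + b_income * income

# Inverse logit, 1 / (1 + exp(-eta)), turns log-odds into a probability.
p <- plogis(eta)

print(p)
```

Under these assumptions the result is close to the printed prediction of about 0.0042, meaning the model assigns this high-income household a very low probability of being at risk.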
##
## . FALSE TRUE
## 0 2889 5506
## 1 0 3071
The bottom-right cell (3,071) is the number of households labeled at risk of obesity, with an income of less than 90,000 and access to fast food, that the model also predicted to be at risk using income and race as predictors. The top-left cell (2,889) is the number of households we labeled not at risk of obesity for which the model predicted the same result. Together these are 5,960 of 11,466 records, so about 52% of records were correctly predicted one way or the other. The top-right cell (5,506) is the number of households we labeled not at risk of obesity that the model incorrectly predicted to be at risk (“false positives”). The bottom-left cell (0) is the number of households that are actually at risk of obesity but that the model incorrectly predicted to be safe (“false negatives”).
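The summary figures for this table can be checked with a few lines of R. The sketch below re-enters the four counts from the table above by hand and assumes, as in the description, that rows are the assumed risk label and columns are the model's prediction.

```r
# Counts copied from the table above: rows are the assumed risk label (0/1),
# columns are the model's prediction (FALSE/TRUE).
cm <- matrix(
  c(2889, 5506,
       0, 3071),
  nrow = 2, byrow = TRUE,
  dimnames = list(actual = c("0", "1"), predicted = c("FALSE", "TRUE"))
)

# Share of all records on the diagonal (correct predictions).
accuracy <- (cm["0", "FALSE"] + cm["1", "TRUE"]) / sum(cm)

# Share of not-at-risk households wrongly flagged as at risk.
false_positive_rate <- cm["0", "TRUE"] / sum(cm["0", ])

# Share of at-risk households wrongly predicted to be safe.
false_negative_rate <- cm["1", "FALSE"] / sum(cm["1", ])

print(round(c(accuracy = accuracy,
              false_positive_rate = false_positive_rate,
              false_negative_rate = false_negative_rate), 3))
```

Reading the rates separately shows that the model never misses an at-risk household in this table, but it flags a large share of not-at-risk households.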